M. Wyatt Toure1*, Simon M. Reader1

1McGill University, Department of Biology, 1205 Docteur Penfield, Montreal, Quebec H3A 1B1, Canada

*Corresponding Author:
M. Wyatt Toure
e-mail: m_wyatt.toure@mail.mcgill.ca

Date of last update: May 25 2021


Brief Overview

This page reports the analyses for the experiment described in ‘Colour biases in learned foraging preferences in Trinidadian guppies’. The raw data used to conduct these analyses are available in the colour-learning-project-data.csv file. Descriptions of the variables found in the dataset are given in the metadata section of the README file. The R script to reproduce the analysis and this site are in the analysis.Rmd file. The methods used to produce this data can be found on the Methods page.


Models

We analysed the data from our experiment using linear mixed effect and generalized linear mixed effect models with the lmer() and glmer() functions from the lme4 package. P-values and effective degrees of freedom were obtained using the lmerTest package. Model residuals were checked they met distributional assumptions with the DHARMa package, you can click the ‘See Model Residuals’ button below the model formulas to see the residual diagnostic plots produced by DHARMa for that particular model.


Model 1 - Preference for the green object at baseline

This first model contains the data for all individual guppies. I looked at the green object preference of all guppies in an intercept only model to see if the green object preference at baseline was significantly different from zero. green.object.preference is the time spent near the green object subtracted by the time spent near the blue object

baseline_data_model <-
  lm(green.object.preference ~ 1,
    data = baseline_data
  )


Result
Factor Estimate Std. Error T statistic P value
Intercept 1.065 3.008 0.354 0.727

During the initial test, there was no significant preference for the green object across all guppies (p = 0.727).

Preference for the green object relative to the blue object across all guppies at baseline. Negative values represent more time spent with the blue object, positive values indicate more time spent with the green object. Data are means +/- 95% CI

Figure 1: Preference for the green object relative to the blue object across all guppies at baseline. Negative values represent more time spent with the blue object, positive values indicate more time spent with the green object. Data are means +/- 95% CI


Model 2 - Preference for the rewarding object during training

To see whether fish were responsive during training our second model asks whether the preference for the rewarding object changes throughout training and whether the change in rewarding object preference is different between the treatments.

  • Response variable: rewarding.object.preference is the time (seconds) spent near the rewarding object subtracted by the time spent near the unrewarding object
  • Fixed effect: rewarding.object.colour is the identity of the rewarding object (blue or green)
  • Fixed effect: trial is the number of the training trial. In this model it is supplied as an integer
  • Random effect: id is the identity of the individual fish
training_data_model <-
  lmer(rewarding.object.preference ~ rewarding.object.colour * trial + (1 | id),
    data = training_data
  )

There is a slight deviation in the upper quantile but no indication in the residual plot of a gross model misfit.


Results
Factor Estimate Std. Error T statistic df P value
Intercept -11.017 19.490 -0.565 48.463 0.575
Reward object colour 55.989 27.341 2.048 47.159 0.046
Trial 7.700 1.085 7.098 438.052 < .001
Rewarding object colour X Trial -0.549 1.490 -0.368 434.792 0.713

There was a significant effect of trial. Over the 20 training trials, guppies in the two treatments increased their relative preference for their respective rewarded objects by 7.7 seconds each trial (Figure 2, p < .001). There was also a significant effect of rewarded-object colour (p = 0.046). during training green-rewarded guppies expressed a stronger preference for their rewarded object (the green object) than did blue-rewarded guppies did for the blue object. However, there was no significant interaction effect between rewarding object colour and trial (p = 0.713), i.e., the rate of increase in object preference over trials did not significantly differ between the treatments.

Relative preference for the green object in both treatments during training trials (trials 1-20). Negative values represent more time spent with the blue object, positive values indicate more time spent with the green object. Light lines connect individuals across trials and bold lines represents a linear fit with 95% CI (grey shading). Subjects were consistently rewarded for approaching the blue object (dashed lines) or the green object (solid lines).

Figure 2: Relative preference for the green object in both treatments during training trials (trials 1-20). Negative values represent more time spent with the blue object, positive values indicate more time spent with the green object. Light lines connect individuals across trials and bold lines represents a linear fit with 95% CI (grey shading). Subjects were consistently rewarded for approaching the blue object (dashed lines) or the green object (solid lines).


Model 3 - Preference for the rewarded object during testing depending on treatment

For the main effects of training and rewarding object colour on rewarding object preference I fit a generalized linear mixed effects model with a Gaussian distribution which modeled the variances to account for variance heterogeneity using the package glmmTMB. My third model asks whether the preference for the rewarding object changed between baseline and final test and looks for an interaction with rewarded object colour.

  • Response variable: rewarding.object.preference is the time (seconds) spent near the rewarding object subtracted by the time spent near the unrewarding object
  • Fixed effect: rewarding.object.colour is the identity of the rewarding object (blue or green)
  • Fixed effect: trial is the number of the training trial. In this model it is supplied as a factor where 0 is the baseline trial and 21 is the final trial
  • Random effect: id is the identity of the individual fish
test_data_model_glm <-  
  glmmTMB(rewarding.object.preference ~  
            trial * rewarding.object.colour + (1|id) + 
            diag(0 + rewarding.object.colour:trial |id), 
  data = test_data, 
  family = gaussian
  )


Results
Table 1: Summary table for Model 3. Estimates ± standard error (SE) of the effects of trial and rewarding object colour on the rewarding object preference from the generalized linear mixed effect model containing the effects Trial, Rewarding object colour, and their interaction effect (Trial X Rewarding object colour).
Factor Estimate Std. Error T statistic P value
Intercept 1.842 3.011 0.612 0.541
Trial 17.697 10.017 1.767 0.077
Rewarding object colour 1.645 5.542 0.297 0.767
Rewarding object colour X Trial 65.351 20.832 3.137 0.002

When comparing the initial and final preference test, both conducted after training and without food rewards present, we found a significant interaction effect between test and rewarding object colour (p = 0.002). Guppies that had been green-rewarded had a shift in their rewarded object preference that was on average 65.4 seconds stronger than the shift in rewarded object preference of guppies trained to blue (Figure 3). These results were unaffected by the removal of one fish that did not feed during training (See ESM model 2).

Changes in object preference from an initial test before training to a final test after training. During training, fish were rewarded for approaching the blue object (blue squares and lines) or the green object (green squares and lines). At test, no food reward was present. Dashed line represents an equal preference for either object. Data are means ± 95% CI; lighter points and lines are data for each individual. The final preference for green-rewarded fish is stronger than that of fish that has been rewarded for approaching the blue object.

Figure 3: Changes in object preference from an initial test before training to a final test after training. During training, fish were rewarded for approaching the blue object (blue squares and lines) or the green object (green squares and lines). At test, no food reward was present. Dashed line represents an equal preference for either object. Data are means ± 95% CI; lighter points and lines are data for each individual. The final preference for green-rewarded fish is stronger than that of fish that has been rewarded for approaching the blue object.


Post-hoc Comparisons

To determine whether the means of the final rewarding object preference for the two treatments were different I conducted post-hoc comparisons with the package emmeans. I compared the following means:

  • Final test blue-trained and initial test blue-trained
  • Final test green-trained and initial test green-trained
  • Final test green-trained and final test blue-trained
  • Initial test green-trained and initial test blue-trained
test_data_model_emmeans <- emmeans(test_data_model_glm,
  specs = ~ trial:rewarding.object.colour
)

# Making vectors to represent means of interest from emmeans() output
blue0 <- c(1, 0, 0, 0)
blue21 <- c(0, 1, 0, 0)
green0 <- c(0, 0, 1, 0)
green21 <- c(0, 0, 0, 1)

# Set seed to prevent confidence intervals from changing when code is re-run
set.seed(123)

custom_contrasts <- contrast(test_data_model_emmeans,
  method = list(
    "21 blue - 0 blue" = blue21 - blue0,
    "21 green - 0 green " = green21 - green0,
    "21 green - 21 blue" = green21 - blue21,
    "0 green - 0 blue" = green0 - blue0
  ), adjust = "mvt"
) %>%
  summary(infer = TRUE)


Results
Table 2: Table of post-hoc tests with a multivariate-t adjustment for multiple comparisons of a selected set of means. The numbers represent the initial test (0) and the final test (21). The colour corresponds to the identity of the rewarding object (blue for blue-rewarded guppies, green for green-rewarded guppies). Values are all rounded to 3 decimal places. CL = confidence limit.
Contrast Estimate Lower CL Upper CL df P Value
21 blue - 0 blue 17.697 -7.968 43.363 34 0.251
21 green - 0 green 83.048 36.247 129.849 34 < .001
21 green - 21 blue 66.996 15.542 118.449 34 0.007
0 green - 0 blue 1.645 -12.555 15.845 34 0.988

Post-hoc comparisons (Table 2) reveal that initially, before training, there was no significant difference in the strength of preference for the rewarded object between the treatments (p = 0.988). The shift in rewarded object preference between the initial and final preference tests was significant for green-trained guppies but not for blue-trained guppies: green trained guppies increased their preference for the green object by 83 seconds from initial to final (p < .001) test whereas blue-trained guppies increased their preference for the blue object by 17.7 seconds an effect that was not statistically significant (p = 0.251). At final test, green-rewarded guppies had a significantly stronger preference for the previously rewarded object compared to the blue-rewarded guppies (p = 0.007).


Model 4 - Is there a difference in feeding attempts between treatments?

A discrepancy in reinforcement between treatments may influence performance on a final preference test. To see whether there was a difference in feeding between treatments I counted the number of trials in which an individual fish ate throughout all of training and compared the feeding counts between treatments. To do this I fit a generalized linear model with a negative binomial distribution. The response variable ‘feeding count’ is a sum of the number of trials in which a guppy ate.

  • Response variable: feeding.count is the number of trials in which an individual fish ate
  • Fixed effect: rewarding.object.colour is the identity of the rewarding object (blue or green)
Model
feeding_data_model <-
  glm.nb(feeding.count ~ rewarding.object.colour,
    data = my_feeding_data
  )


Results
Factor Estimate Std. Error T statistic P value
Intercept 2.407 0.195 12.351 0.000
Rewarding object colour 0.071 0.263 0.270 0.787

We found no significant difference in the number of trials individuals fed between green-rewarded and blue-rewarded fish (Figure 4, p = 0.787). Removing the one individual that did not feed during training gave a similar result (p = 0.87). We also incorporated feeding count as a covariate in Model 3, finding the same pattern of results (See ESM Model 2).

Average number of trials in which a fish fed during training. Data are means ± 95% confidence intervals with probability density functions of the data (showing the negative binomial distribution) to the right of the raw data.

Figure 4: Average number of trials in which a fish fed during training. Data are means ± 95% confidence intervals with probability density functions of the data (showing the negative binomial distribution) to the right of the raw data.


ESM Models

In this section we describe models not included in the main text.

Addressing feeding count

The models described in this section were run to address the potential role of feeding count on test performance. The concern was that the results may be explained by differential levels of reinforcement between the groups or between individuals during training. To ensure our results were robust to matters arising from feeding count variation we:

  1. Removed an individual that did not feed and re-ran the analysis in Model 3
  2. Included feeding count as a co-variate and re-ran the analysis in Model 3


ESM Model 1 - Removing individual that did not feed

There is a fish that did not eat during any trials, however removing this individual does not change the conclusions

test_feeding_data_low_feeders_removed <-
  test_feeding_data %>% filter(feeding.count > 0)
Model
test_feeding_data_low_feeders_removed_model <-
  glmmTMB(rewarding.object.preference ~
  trial * rewarding.object.colour + (1 | id) +
    diag(0 + rewarding.object.colour:trial | id),
  data = test_feeding_data_low_feeders_removed,
  family = gaussian
  )


Results
Factor Estimate Std. Error T statistic P value
Intercept 0.445 3.004 0.148 0.882
Trial 24.090 9.694 2.485 0.013
Rewarding object colour 3.042 5.538 0.549 0.583
Rewarding object colour X Trial 58.958 20.679 2.851 0.004

ESM Model 2 - Including feeding count as a covariate

With this next model we looked to see whether including the amount of trials an individual fed as a covariate in the model changes the conclusions.

  • Response variable: rewarding.object.preference is the time (seconds) spent near the rewarding object subtracted by the time spent near the unrewarding object
  • Fixed effect: rewarding.object.colour is the identity of the rewarding object (blue or green)
  • Fixed effect: trial is the number of the training trial. In this model it is supplied as an integer
  • Random effect: id is the identity of the individual fish
  • Covariate: feeding.count is the number of trials in which an individual fish ate
test_data_feeding_controlled_model <-
  glmmTMB(rewarding.object.preference ~
  trial * rewarding.object.colour + feeding.count + (1 | id) +
    diag(0 + rewarding.object.colour * trial | id),
  data = test_feeding_data,
  family = gaussian
  )


Results
Table 3: Summary table for a modification of Model 3. This model is the same as that described in Table S1 except it includes feeding count as a covariate. Estimates ± standard error (SE) of the effects of trial and rewarding object colour of the rewarding object colour from the generalized linear mixed effect model containing the effects (Trial, Rewarding object colour, and their interaction effect).
Factor Estimate Std. Error T statistic P value
Intercept -3.642 7.215 -0.505 0.614
Trial 17.697 9.553 1.852 0.064
Rewarding object colour 1.241 5.514 0.225 0.822
Feeding count 0.494 0.579 0.854 0.393
Rewarding object colour X Trial 65.352 19.847 3.293 < .001

The main results do not change if I control for feeding count. The above table is the output feeding controlled model. Below we have the output table for the non-feeding-count controlled model from Model 3.

Factor Estimate Std. Error T statistic P value
Intercept 1.842 3.011 0.612 0.541
Trial 17.697 10.017 1.767 0.077
Rewarding object colour 1.645 5.542 0.297 0.767
Rewarding object colour X Trial 65.351 20.832 3.137 0.002

In both models the p-values are similar. While the effect of feeding count is not significant (p = 0.393) the effect of feeding count trends in the expected direction in our feeding count controlled model. As feeding count increases the preference for the rewarding object colour also increases.


Preference as a percentage

A reviewer of our manuscript asks

Why the authors did not (also) exploit a percentage preference, which is commonly used?

To determine whether our results are robust despite our different measure we also ran an analysis with the percentage preference as ESM Model 3. The results from the main experiment remain unchanged when we do this.

ESM Model 3 - Percentage preference for the rewarded object during testing depending on treatment

Since our data are continuous proportions we used a generalized linear mixed effect model with a beta distribution.

prop_test_data_model_glm <-  
  glmmTMB(rewarding.object.preference.prop ~  
            trial * rewarding.object.colour + (1|id), 
  data = test_data, 
  family = beta_family(link="logit")
  )
simulationOutput <- simulateResiduals(
  fittedModel = prop_test_data_model_glm,
  n = 1000
)
plot(simulationOutput)


Results
Table 4: Summary table for ESM Model 3. Estimates ± standard error (SE) of the effects of trial and rewarding object colour on the rewarding object preference from the generalized linear mixed effect model containing the effects Trial, Rewarding object colour, and their interaction effect (Trial X Rewarding object colour).
Factor Estimate Std. Error T statistic P value
Intercept 0.096 0.191 0.504 0.615
Trial 0.162 0.271 0.599 0.549
Rewarding object colour -0.070 0.259 -0.269 0.788
Rewarding object colour X Trial 1.025 0.379 2.702 0.007
Changes in proportional object preference from an initial test before training to a final test after training. During training, fish were rewarded for approaching the blue object (blue squares and lines) or the green object (green squares and lines). At test, no food reward was present. Dashed line represents an equal preference for either object. Data are means ± 95% CI; lighter points and lines are data for each individual. The final preference for green-rewarded fish is stronger than that of fish that has been rewarded for approaching the blue object.

Figure 5: Changes in proportional object preference from an initial test before training to a final test after training. During training, fish were rewarded for approaching the blue object (blue squares and lines) or the green object (green squares and lines). At test, no food reward was present. Dashed line represents an equal preference for either object. Data are means ± 95% CI; lighter points and lines are data for each individual. The final preference for green-rewarded fish is stronger than that of fish that has been rewarded for approaching the blue object.

Post-Hoc Comparisons
prop_test_data_model_emmeans <- emmeans(prop_test_data_model_glm,
  specs = ~ trial:rewarding.object.colour, type = "response"
)

# Set seed to prevent confidence intervals from changing when code is re-run
set.seed(123)

prop_custom_contrasts <- contrast(prop_test_data_model_emmeans,
  method = list(
    "21 blue - 0 blue" = blue21 - blue0,
    "21 green - 0 green " = green21 - green0,
    "21 green - 21 blue" = green21 - blue21,
    "0 green - 0 blue" = green0 - blue0
  ), adjust = "mvt"
) %>%
  summary(infer = TRUE)
Results
Table 5: Table of post-hoc tests with a multivariate-t adjustment for multiple comparisons of a selected set of means. The numbers represent the initial test (0) and the final test (21). The colour corresponds to the identity of the rewarding object (blue for blue-rewarded guppies, green for green-rewarded guppies). Values are all rounded to 3 decimal places. CL = confidence limit.
Contrast Odds ratio Lower CL Upper CL df P Value
21 blue / 0 blue 1.176 0.588 2.355 38 0.913
21 green / 0 green 3.278 1.659 6.478 38 < .001
21 green / 21 blue 2.600 1.277 5.291 38 0.005
0 green / 0 blue 0.933 0.481 1.809 38 0.991

Feeding and object preferences

This is a model I (Wyatt) ran just because I was curious.

ESM Model 4 - Preference for the rewarded object during training based on feeding

This model asks whether the time spent near the rewarding object during a training session is influenced by whether a fish ate or not. Here trial is a variable containing the training trials (1-20). It is supplied as a random effect along with individual ID in the model.

  • Response variable: rewarding.object.preference is the time (seconds) spent near the rewarding object subtracted by the time spent near the unrewarding object
  • Fixed effect: ate is a binary factor (Yes or No) and corresponds to whether the fish ate in a given trial
  • Fixed effect: trial is the number of the training trial. In this model it is supplied as an integer
  • Random effect: id is the identity of the individual fish
Model
training_data_model_rewarding_object <-
  lmer(rewarding.object.preference ~ ate + (1 | id) + (1 | trial),
    data = training_data
  )

There is a significant deviation from uniformity as indicated by the significant Kolmogorov-Smirnov test. However, this model has a particularly large sample size (n = 456) so even slight deviations will be significant. Looking at the effect size of the deviation (D = 0.0694912) shows that it is minor (D < 0.1) and visual inspection does not suggest large deviations in the residuals so our model is still appropriate.


Results
Factor Estimate Std. Error T statistic df P value
Intercept 42.532 11.918 3.569 36.95 0.001
Ate 91.185 10.953 8.325 223.75 < .001

Throughout all of training, fish that ate during a session spent on average 91.2 more seconds near the rewarded object compared to fish that did not eat (Figure, 6, p < .001). Fish that spent more trials eating may therefore have received more reinforcement for the object-food association. This result is what led me to investigate whether there was a difference in food reinforcement between the two treatments in Model 4.

Preference for the rewarding object during training based on whether an individual ate during a trial or not. Dashed line represents the no preference value. In panel A, Data are means ± 95% CI. In panel B, the data consist of all the feeding choices of all individuals across all training trials. Data are means ± 95% CI with probability density functions to the right of the raw data.

Figure 6: Preference for the rewarding object during training based on whether an individual ate during a trial or not. Dashed line represents the no preference value. In panel A, Data are means ± 95% CI. In panel B, the data consist of all the feeding choices of all individuals across all training trials. Data are means ± 95% CI with probability density functions to the right of the raw data.


Tools used and References

A complete list of the tools used is produced below

Package Version Reference
broom 0.5.5 David Robinson and Alex Hayes (2020). broom: Convert Statistical Analysis Objects into Tidy Tibbles. R package version 0.5.5. https://CRAN.R-project.org/package=broom
broom.mixed 0.2.6 Ben Bolker and David Robinson (2020). broom.mixed: Tidying Methods for Mixed Models. R package version 0.2.6. https://CRAN.R-project.org/package=broom.mixed
carData 3.0.3 John Fox, Sanford Weisberg and Brad Price (2019). carData: Companion to Applied Regression Data Sets. R package version 3.0-3. https://CRAN.R-project.org/package=carData
cowplot 1.0.0 Claus O. Wilke (2019). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 1.0.0. https://CRAN.R-project.org/package=cowplot
DHARMa 0.3.3.0 Florian Hartig (2020). DHARMa: Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models. R package version 0.3.3.0. http://florianhartig.github.io/DHARMa/
dplyr 1.0.3 Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2021). dplyr: A Grammar of Data Manipulation. R package version 1.0.3. https://CRAN.R-project.org/package=dplyr
effects 4.1.4 John Fox and Sanford Weisberg (2019). An R Companion to Applied Regression, 3rd Edition. Thousand Oaks, CA http://tinyurl.com/carbook
emmeans 1.5.1 Russell Lenth (2020). emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 1.5.1. https://CRAN.R-project.org/package=emmeans
ggplot2 3.3.3 H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
ggpubr 0.2.5 Alboukadel Kassambara (2020). ggpubr: ‘ggplot2’ Based Publication Ready Plots. R package version 0.2.5. https://CRAN.R-project.org/package=ggpubr
glmmTMB 1.0.0 Mollie E. Brooks, Kasper Kristensen, Koen J. van Benthem, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler and Benjamin M. Bolker (2017). glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. The R Journal, 9(2), 378-400.
knitr 1.30 Yihui Xie (2020). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.30.
lme4 1.1.21 Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.
lmerTest 3.1.1 Kuznetsova A, Brockhoff PB, Christensen RHB (2017). “lmerTest Package:Tests in Linear Mixed Effects Models.” Journal of StatisticalSoftware, 82(13), 1-26. doi: 10.18637/jss.v082.i13 (URL:https://doi.org/10.18637/jss.v082.i13).
magrittr 2.0.1 Stefan Milton Bache and Hadley Wickham (2020). magrittr: A Forward-Pipe Operator for R. R package version 2.0.1. https://CRAN.R-project.org/package=magrittr
MASS 7.3.51.4 Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0
Matrix 1.2.18 Douglas Bates and Martin Maechler (2019). Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.2-18. https://CRAN.R-project.org/package=Matrix
R 3.6.2 R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
report 0.2.0 Makowski, D., Ben-Shachar, M.S., Patil, I. & Lüdecke, D. (2020). Automated reporting as a practical tool to improve reproducibility and methodological best practices adoption. CRAN. Available from https://github.com/easystats/report. doi: .
tidyext 0.3.6 Michael Clark (2021). tidyext: Tidy Extensions for Data Processing. https://m-clark.github.io/tidyext, https://github.com/m-clark/tidyext.
tidyr 1.0.2 Hadley Wickham and Lionel Henry (2020). tidyr: Tidy Messy Data. R package version 1.0.2. https://CRAN.R-project.org/package=tidyr